---
title: "Guardrails"
description: "Screen agent inputs and review outputs with guardrails"
icon: "shield-halved"
---

Guardrails validate agent inputs and outputs using the Agents SDK. They act as checkpoints that screen incoming messages before processing and review agent responses before delivery, ensuring agents stay on-topic, avoid sensitive content, and follow format requirements.

## When to Use Guardrails

Guardrails solve validation challenges that go beyond simple field checks:

<CardGroup cols={2}>
  <Card title="Content Filtering" icon="filter">
    Block off-topic questions, inappropriate language, or sensitive information leaks
  </Card>

  <Card title="Format Enforcement" icon="align-left">
    Require specific response structures, prefixes, or formatting rules
  </Card>

  <Card title="Compliance" icon="file-shield">
    Enforce regulatory requirements, privacy policies, or business rules
  </Card>

  <Card title="Security" icon="lock">
    Prevent prompt injection, data exfiltration, or unauthorized actions
  </Card>
</CardGroup>

<Note>
For validating **tool inputs** (e.g., checking field values, data types, ranges), use [Pydantic validators](/core-framework/tools/custom-tools/validation) instead. Guardrails are for **agent-level** validation.
</Note>

## Practical Examples

### Example 1: Filtering Off-Topic Questions

Use input guardrails to keep agents focused on their domain. This example delegates relevance decisions to a judge agent:

```python
from agency_swarm import Agency, Agent, GuardrailFunctionOutput, RunContextWrapper, input_guardrail
from agents.model_settings import ModelSettings
from pydantic import BaseModel

class RelevanceDecision(BaseModel):
    is_relevant: bool
    reason: str

guardrail_agent = Agent(
    name="GuardrailAgent",
    instructions=(
        "You screen incoming messages for a customer-support assistant. "
        "Treat questions about account access, billing, and troubleshooting as relevant. "
        "Flag any other unrelated requests as irrelevant."
    ),
    model="gpt-5-nano",
    model_settings=ModelSettings(reasoning_effort="minimal"),
    output_type=RelevanceDecision,
)

@input_guardrail
async def require_support_topic(
    context: RunContextWrapper, agent: Agent, user_input: str | list[str]
) -> GuardrailFunctionOutput:
    """Forward the decision to the guardrail agent."""
    # Join multiple messages with real newlines before sending them to the judge
    candidate = user_input if isinstance(user_input, str) else "\n".join(user_input)
    guardrail_result = await guardrail_agent.get_response(candidate, context=context.context)
    decision = RelevanceDecision.model_validate(guardrail_result.final_output)

    if not decision.is_relevant:
        return GuardrailFunctionOutput(
            output_info="Only support questions are allowed. Ask about billing, account access, or troubleshooting.",
            tripwire_triggered=True,
        )
    return GuardrailFunctionOutput(output_info="", tripwire_triggered=False)

support_agent = Agent(
    name="CustomerSupportAgent",
    instructions="You help customers resolve account, billing, and troubleshooting issues.",
    model="gpt-5-mini",
    input_guardrails=[require_support_topic],
    throw_input_guardrail_error=False,  # Friendly mode: guidance returned as assistant message
)
```

See the full example at [`examples/guardrails_input.py`](https://github.com/VRSEN/agency-swarm/blob/main/examples/guardrails_input.py).

### Example 2: Preventing Sensitive Information Leaks

Use output guardrails to review responses before delivery. This example prevents agents from sharing email addresses:

```python
from agency_swarm import Agency, Agent, GuardrailFunctionOutput, RunContextWrapper, output_guardrail

@output_guardrail(name="ForbidSensitiveEmail")
async def forbid_sensitive_email(
    context: RunContextWrapper, agent: Agent, response_text: str
) -> GuardrailFunctionOutput:
    """Reject responses that include personal email addresses."""
    if "@" in response_text:
        return GuardrailFunctionOutput(
            output_info="Do not share email addresses. Offer to connect via the support portal instead.",
            tripwire_triggered=True,
        )
    return GuardrailFunctionOutput(output_info="", tripwire_triggered=False)

support_agent = Agent(
    name="SupportPilot",
    instructions="You handle customer support. Official email: support@example.com.",
    model="gpt-5",
    output_guardrails=[forbid_sensitive_email],
    validation_attempts=1,  # Agent gets 1 retry to fix the response
)
```

See the full example at [`examples/guardrails_output.py`](https://github.com/VRSEN/agency-swarm/blob/main/examples/guardrails_output.py).
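
The `"@" in response_text` check above is deliberately crude: it also trips on text such as `@output_guardrail` or social handles. As a minimal sketch, assuming a regular expression is acceptable for your use case, the check could be tightened as follows. `EMAIL_PATTERN` and `contains_email` are illustrative names, not part of Agency Swarm, and the pattern covers common address shapes rather than every valid address.

```python
import re

# Illustrative pattern (an assumption, not an Agency Swarm API): it matches
# common address shapes and is not a full RFC 5322 validator.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def contains_email(text: str) -> bool:
    """Return True when the text appears to contain an email address."""
    return EMAIL_PATTERN.search(text) is not None


print(contains_email("Write to support@example.com"))           # True
print(contains_email("Apply the @output_guardrail decorator"))  # False
```

Inside the guardrail, `contains_email(response_text)` would replace the `"@" in response_text` condition.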

### Example 3: Simple Format Enforcement

Require responses to follow a specific format:

```python
import json

from agency_swarm import Agent, GuardrailFunctionOutput, RunContextWrapper, output_guardrail


@output_guardrail(name="RequireJSONFormat")
async def require_json_format(
    context: RunContextWrapper, agent: Agent, response_text: str
) -> GuardrailFunctionOutput:
    """Ensure responses are valid JSON."""
    try:
        json.loads(response_text)
    except json.JSONDecodeError:
        # Invalid JSON: trip the guardrail and tell the agent what to fix
        return GuardrailFunctionOutput(
            output_info="Response must be valid JSON with no surrounding text.",
            tripwire_triggered=True,
        )
    return GuardrailFunctionOutput(output_info="", tripwire_triggered=False)
```
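
Format rules often go beyond "parses as JSON". The following is a hedged sketch of a stricter check that also requires a JSON object with specific keys. `json_feedback` and the key names are hypothetical; the function only builds the feedback text, which a guardrail could pass as `output_info` while setting `tripwire_triggered=True` whenever the text is non-empty.

```python
from __future__ import annotations

import json


def json_feedback(response_text: str, required_keys: tuple[str, ...]) -> str:
    """Return guidance for the agent, or an empty string when the text passes."""
    try:
        payload = json.loads(response_text)
    except json.JSONDecodeError as exc:
        return f"Response must be valid JSON ({exc.msg} at position {exc.pos})."
    if not isinstance(payload, dict):
        return "Response must be a JSON object."
    missing = [key for key in required_keys if key not in payload]
    if missing:
        return f"Response is missing required keys: {', '.join(missing)}."
    return ""


print(json_feedback('{"status": "ok"}', ("status", "message")))
# Response is missing required keys: message.
```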

## Output Guardrails

Output guardrails validate agent responses **before** they reach users or other agents. When a guardrail trips, the agent receives feedback and retries.

### Function Signature

Each output guardrail receives three parameters:

```python
@output_guardrail
async def my_output_guardrail(
    context: RunContextWrapper,
    agent: Agent,
    response_text: str | BaseModel,  # a model instance when output_type is set
) -> GuardrailFunctionOutput:
    """Validate agent output."""
    # Your validation logic here
    pass
```

**Parameters:**
- `context`: Run context wrapper with access to shared state
- `agent`: The Agent instance generating the response
- `response_text`: The agent's response as a string, or a Pydantic model if `output_type` is specified

**Return:**
- `GuardrailFunctionOutput` with:
  - `tripwire_triggered` (bool): `True` if validation failed
  - `output_info` (str): Feedback message sent to the agent when `tripwire_triggered=True`

### Basic Output Guardrail

```python
from agency_swarm import output_guardrail, GuardrailFunctionOutput, RunContextWrapper, Agent

@output_guardrail
async def response_content_guardrail(
    context: RunContextWrapper, agent: Agent, response_text: str
) -> GuardrailFunctionOutput:
    """Reject responses containing inappropriate content."""
    tripwire_triggered = False
    output_info = ""

    if "bad word" in response_text.lower():
        tripwire_triggered = True
        output_info = "Please avoid using inappropriate language."

    return GuardrailFunctionOutput(
        output_info=output_info,
        tripwire_triggered=tripwire_triggered,
    )

agent = Agent(
    name="CustomerSupportAgent",
    instructions="You are a helpful customer support agent.",
    output_guardrails=[response_content_guardrail],
)
```

## Output Guardrail Retry Flow

When an output guardrail trips, the agent gets one or more chances to fix its response. The `validation_attempts` parameter sets how many retries are allowed.

### How Retry Works

<Steps>
  <Step title="Agent generates response">
    The agent produces its initial response
  </Step>
  <Step title="Output guardrail checks response">
    Each output guardrail validates the response
  </Step>
  <Step title="If validation fails">
    The agent receives a **system message** containing the `output_info` from the guardrail
  </Step>
  <Step title="Agent retries">
    The agent generates a new response, informed by the error message
  </Step>
  <Step title="Repeat until success or limit reached">
    This cycle continues up to `validation_attempts` times
  </Step>
  <Step title="If all attempts fail">
    An `OutputGuardrailTripwireTriggered` exception is raised
  </Step>
</Steps>

### Configuring Retry Attempts

```python
agent = Agent(
    name="CustomerSupportAgent",
    instructions="You are a helpful customer support agent.",
    output_guardrails=[response_content_guardrail],
    validation_attempts=2,  # Default is 1 (one retry)
)
```

**Settings:**
- `validation_attempts=0`: Fail-fast (no retries, immediate exception)
- `validation_attempts=1`: Default (one retry after initial failure)
- `validation_attempts=2+`: Multiple retries for complex validations

<Note>
Each retry sends the `output_info` message to the agent as a system message, giving the agent context to adjust its response.
</Note>
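
The retry flow and the `validation_attempts` settings above can be modeled as a plain loop. This is a simplified sketch for building intuition, not Agency Swarm's implementation: `run_with_retries`, `generate`, `check`, and the local `TripwireTriggered` exception are all hypothetical stand-ins.

```python
from typing import Callable


class TripwireTriggered(Exception):
    """Local stand-in for OutputGuardrailTripwireTriggered (not the real class)."""


def run_with_retries(
    generate: Callable[[list[str]], str],
    check: Callable[[str], str],
    validation_attempts: int = 1,
) -> str:
    """Call generate once, then retry up to validation_attempts times."""
    feedback: list[str] = []  # guidance accumulated from failed attempts
    for _ in range(max(validation_attempts, 0) + 1):
        response = generate(feedback)
        guidance = check(response)  # empty string means the response passed
        if not guidance:
            return response
        feedback.append(guidance)
    raise TripwireTriggered(feedback[-1])


def generate(feedback: list[str]) -> str:
    # Toy "agent": adds the prefix only after it has seen guidance.
    return "Response: done" if feedback else "done"


def check(response: str) -> str:
    return "" if response.startswith("Response:") else "Start with 'Response:'"


print(run_with_retries(generate, check, validation_attempts=1))  # Response: done
```

With `validation_attempts=0` the same call raises immediately, which mirrors the fail-fast setting.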

### Handling Validation Failures

After all validation attempts fail, handle the exception:

```python
from agency_swarm import OutputGuardrailTripwireTriggered

try:
    response = await agency.get_response("Hello!")
except OutputGuardrailTripwireTriggered as e:
    print(f"Validation failed: {e.guardrail_result.output_info}")
    # Implement fallback behavior or notify user
```

## Input Guardrails

Input guardrails validate incoming messages **before** they reach the agent. They screen both user input and inter-agent communication.

### Simplified Input Processing

Agency Swarm automatically extracts text content from messages, so your guardrails receive clean text instead of complex message structures. You don't need manual extraction logic.

### Function Signature

Each input guardrail receives three parameters:

```python
@input_guardrail
async def my_input_guardrail(
    context: RunContextWrapper,
    agent: Agent,
    user_input: str | list[str]
) -> GuardrailFunctionOutput:
    """Validate user input."""
    # Your validation logic here
    pass
```

**Parameters:**
- `context`: Run context wrapper with access to shared state
- `agent`: The Agent instance receiving the input
- `user_input`: Extracted text content
  - **Single message**: A string containing the message content
  - **Multiple consecutive messages**: A list of strings, one per message

**Return:**
- `GuardrailFunctionOutput` with:
  - `tripwire_triggered` (bool): `True` if validation failed
  - `output_info` (str): Guidance message returned to the caller

<Note>
File and image inputs inside messages are not passed to the guardrail.
</Note>

### Input Types

When a user sends multiple messages:
```json
[
  {"role": "user", "content": "Hi"},
  {"role": "user", "content": "How are you?"}
]
```

Your guardrail receives:
```python
["Hi", "How are you?"]
```

This allows you to process each new input message individually or validate them as a group.
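
As a minimal sketch of per-message validation, the helpers below normalize the `str | list[str]` input and report the first message that breaks a rule. `as_messages` and `first_missing_prefix` are hypothetical helper names, and the `Request:` prefix rule is only an example.

```python
from __future__ import annotations


def as_messages(user_input: str | list[str]) -> list[str]:
    """Normalize guardrail input to a list with one entry per message."""
    return [user_input] if isinstance(user_input, str) else list(user_input)


def first_missing_prefix(user_input: str | list[str], prefix: str) -> str | None:
    """Return the first message that lacks the prefix, or None if all pass."""
    for message in as_messages(user_input):
        if not message.startswith(prefix):
            return message
    return None


print(first_missing_prefix(["Request: reset password", "How are you?"], "Request:"))
# How are you?
print(first_missing_prefix("Request: reset password", "Request:"))
# None
```

Joining the list into one string instead validates the messages as a group; with a prefix rule, that only inspects the start of the first message.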

### Basic Input Guardrail

```python
from agency_swarm import input_guardrail, GuardrailFunctionOutput, RunContextWrapper, Agent

@input_guardrail
async def require_task_prefix(
    context: RunContextWrapper, agent: Agent, user_input: str | list[str]
) -> GuardrailFunctionOutput:
    """Require user requests to begin with 'Request:'"""

    # Normalize to one string; list inputs are joined with spaces
    text = user_input if isinstance(user_input, str) else " ".join(user_input)
    condition = not text.startswith("Request:")

    return GuardrailFunctionOutput(
        output_info="Prefix your request with 'Request:' describing what you need." if condition else "",
        tripwire_triggered=condition,
    )

agent = Agent(
    name="CustomerSupportAgent",
    instructions="You are a helpful customer support agent.",
    input_guardrails=[require_task_prefix],
)
```

## Friendly vs Strict Mode

Input guardrails support two modes that control how guardrail guidance is delivered: **friendly mode** (default) and **strict mode**. The `throw_input_guardrail_error` parameter controls this behavior.

### Friendly Mode (Default)

**Setting:** `throw_input_guardrail_error=False`

In friendly mode, guardrail guidance flows naturally as if it came from the agent itself:
- Guidance returned as `final_output` (non-streaming) or `message_output_created` event (streaming)
- No exceptions raised
- Persisted as an **assistant message** (`message_origin="input_guardrail_message"`)
- User experience stays fluid and conversational

**When to use:**
- Conversational flows where you want to guide users naturally
- Internal agents communicating with each other
- Cases where you want to provide helpful feedback without interrupting the flow

**Example:**
```python
agent = Agent(
    name="CustomerSupportAgent",
    instructions="You are a helpful customer support agent.",
    input_guardrails=[require_task_prefix],
    throw_input_guardrail_error=False,  # Friendly mode (default)
)

# Usage
response = await agency.get_response("Hello!")
print(response.final_output)
# Output: "Prefix your request with 'Request:' describing what you need."
# No exception raised - guidance returned directly
```

**Streaming behavior:**
```text
RunItemStreamEvent(
    name='message_output_created',
    item=MessageOutputItem(
        raw_item=ResponseOutputMessage(
            id='msg_input_guardrail_guidance',
            content=[ResponseOutputText(text="Prefix your request...")],
            role='assistant',
            status='completed'
        )
    )
)
```

### Strict Mode

**Setting:** `throw_input_guardrail_error=True`

In strict mode, guardrail failures abort the turn immediately:
- `InputGuardrailTripwireTriggered` exception raised
- Persisted as a **system message** (`message_origin="input_guardrail_error"`)
- Turn aborted (agent never processes the input)
- Caller must handle the exception

**When to use:**
- Hard requirements or compliance rules that cannot be bypassed
- Security validations that must block processing
- Cases where you want explicit exception handling

**Example:**
```python
from agency_swarm import InputGuardrailTripwireTriggered

agent = Agent(
    name="CustomerSupportAgent",
    instructions="You are a helpful customer support agent.",
    input_guardrails=[require_task_prefix],
    throw_input_guardrail_error=True,  # Strict mode
)

# Usage
try:
    response = await agency.get_response("Hello!")
except InputGuardrailTripwireTriggered as e:
    print(f"Validation failed: {e.guardrail_result.output_info}")
    # Output: "Validation failed: Prefix your request with 'Request:' describing what you need."
```

### Comparison Table

| Mode | `throw_input_guardrail_error` | Caller sees | Persisted entry | Role | Use case |
|------|-------------------------------|-------------|-----------------|------|----------|
| **Friendly** | `False` (default) | Guardrail text as `final_output` or streaming event | Assistant message (`input_guardrail_message`) | `assistant` | Conversational flows, helpful guidance |
| **Strict** | `True` | `InputGuardrailTripwireTriggered` exception | System message (`input_guardrail_error`) | `system` | Hard requirements, compliance, security |

### Decision Guide

<Accordion title="Should I use friendly or strict mode?">
**Use Friendly Mode when:**
- You want a conversational user experience
- Agents are communicating with each other internally
- Guardrail feedback is helpful guidance, not a hard block
- You don't want to write exception handling code

**Use Strict Mode when:**
- You're enforcing non-negotiable requirements
- Security or compliance rules must block processing
- You want explicit control over error handling
- The caller should know immediately that validation failed
</Accordion>
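
The difference between the two modes can be modeled in a few lines. This is a toy sketch, not Agency Swarm's implementation: `InputTripwire` and `handle_input_tripwire` are hypothetical names, and the history entries are reduced to the three fields from the comparison table.

```python
class InputTripwire(Exception):
    """Local stand-in for InputGuardrailTripwireTriggered (not the real class)."""


def handle_input_tripwire(guidance: str, strict: bool, history: list) -> str:
    """Record the guidance entry, then return it (friendly) or raise (strict)."""
    if strict:
        history.append(
            {"role": "system", "content": guidance, "message_origin": "input_guardrail_error"}
        )
        raise InputTripwire(guidance)
    history.append(
        {"role": "assistant", "content": guidance, "message_origin": "input_guardrail_message"}
    )
    return guidance  # reaches the caller the way final_output would


history = []
print(handle_input_tripwire("Prefix your request with 'Request:'.", False, history))
print(history[0]["role"])  # assistant
```

In both branches exactly one entry is appended; only the role, the origin, and whether the caller must catch an exception differ.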

## Guardrails in Message History

Each guardrail trigger is recorded in the chat history as a guidance entry. Every entry carries a `message_origin` field that identifies the kind of guardrail that fired (input or output) and, for input guardrails, the mode it ran in.

### Message Origin Values

- `input_guardrail_message`: Input guardrail in friendly mode
- `input_guardrail_error`: Input guardrail in strict mode
- `output_guardrail_error`: Output guardrail (always system message)

### Persistence Behavior

| Mode | `throw_input_guardrail_error` | Streaming Event | Persisted Entry |
|------|-------------------------------|-----------------|-----------------|
| **Friendly** | `False` (default) | `message_output_created` with guidance text | Assistant message, `message_origin="input_guardrail_message"` |
| **Strict** | `True` | `{"type": "error", "content": guidance}` | System message, `message_origin="input_guardrail_error"` |

Each triggered guardrail leaves exactly one guidance entry in the chat history:
- In **friendly mode**, that entry is an assistant message and its text matches what the caller receives
- In **strict mode**, the guardrail raises an exception and only the system guidance entry remains

<Note>
The `validation_attempts` parameter does not currently apply to input guardrails; they trigger immediately on validation failure.
</Note>

### Message History After Guardrails Trip

When an input guardrail trips, agent-to-agent messages (requests from calling agents) remain in history alongside the guardrail guidance. This preserves context so calling agents understand what they asked and can adjust their approach.

Output guardrail messages also persist in history to guide retry attempts.

<Accordion title="Example message history entries">

```jsonc
[
    // Input guardrail triggered by user input in friendly mode (presented as assistant message)
    {
        "role": "assistant",
        "content": "Please, prefix your request with 'Support:' describing what you need.",
        "message_origin": "input_guardrail_message",
        "agent": "CustomerSupportAgent",
        "callerAgent": null,
        "agent_run_id": "agent_run_id",
        "timestamp": 1758103764049935,
        "type": "message"
    },

    // Input guardrail triggered within the agency in friendly mode (guidance returned inline)
    {
        "role": "assistant",
        "content": "When chatting with this agent, provide your name (which is Alice), for example, 'Hello, I'm Alice.' Adjust your input and try again.",
        "message_origin": "input_guardrail_message",
        "agent": "DatabaseAgent",
        "callerAgent": "CustomerSupportAgent",
        "agent_run_id": "agent_run_id",
        "parent_run_id": "call_id",
        "timestamp": 1758103766899061,
        "type": "message"
    },

    // Output guardrail triggered by an assistant response
    {
        "role": "system",
        "content": "You are not allowed to include your email address in your response. Ask agent to redirect user to the contact page: https://www.example.com/contact",
        "message_origin": "output_guardrail_error",
        "agent": "DatabaseAgent",
        "callerAgent": "CustomerSupportAgent",
        "agent_run_id": "agent_run_id",
        "parent_run_id": "call_id",
        "timestamp": 1758103770629217,
        "type": "message"
    }
]
```
</Accordion>
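
If you have the history available as a list of dictionaries shaped like the entries above, the `message_origin` field makes guardrail activity easy to audit. The sketch below assumes that shape; how you obtain the history from your agency is not shown, and `guardrail_entries` is a hypothetical helper.

```python
from collections import Counter

# The three origin values listed under "Message Origin Values".
GUARDRAIL_ORIGINS = {
    "input_guardrail_message",
    "input_guardrail_error",
    "output_guardrail_error",
}


def guardrail_entries(history: list) -> list:
    """Keep only the entries written by guardrails."""
    return [m for m in history if m.get("message_origin") in GUARDRAIL_ORIGINS]


history = [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Prefix your request with 'Request:'.",
     "message_origin": "input_guardrail_message", "agent": "CustomerSupportAgent"},
    {"role": "system", "content": "Do not share email addresses.",
     "message_origin": "output_guardrail_error", "agent": "DatabaseAgent"},
]

counts = Counter(entry["message_origin"] for entry in guardrail_entries(history))
print(counts["output_guardrail_error"])  # 1
```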

## Agent-to-Agent Validation

Use guardrails to control how agents communicate with each other. When adding communication flows between agents, the recipient agent's guardrails define the message format.

### Input and Output Guardrails for Inter-Agent Communication

```python
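from agency_swarm import Agency, Agent, GuardrailFunctionOutput, RunContextWrapper, input_guardrail, output_guardrail


@input_guardrail(name="RequireTaskPrefix")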
async def require_task_prefix(
    context: RunContextWrapper, agent: Agent, agent_input: str | list[str]
) -> GuardrailFunctionOutput:
    text = agent_input if isinstance(agent_input, str) else " ".join(agent_input)
    condition = not text.startswith("Task:")
    return GuardrailFunctionOutput(
        output_info="ERROR: Requests to this agent must begin with 'Task:'" if condition else "",
        tripwire_triggered=condition,
    )


@output_guardrail(name="RequireResponsePrefix")
async def require_response_prefix(
    context: RunContextWrapper, agent: Agent, response_text: str
) -> GuardrailFunctionOutput:
    condition = not response_text.startswith("Response:")
    return GuardrailFunctionOutput(
        output_info="ERROR: Your response must start with 'Response:'" if condition else "",
        tripwire_triggered=condition,
    )

ceo = Agent(
    name="CEO",
    instructions="You are the CEO agent.",
)

worker = Agent(
    name="Worker",
    instructions="You are the worker agent.",
    input_guardrails=[require_task_prefix],
    output_guardrails=[require_response_prefix],
    throw_input_guardrail_error=True,
)

agency = Agency(
    ceo,
    communication_flows=[(ceo, worker)],
)
```

In this example:
- If the CEO agent sends a message to the worker that doesn't start with "Task:", the input guardrail triggers
- The CEO receives an error message: `"ERROR: Requests to this agent must begin with 'Task:'"`
- The CEO adjusts its message and tries again (or notifies the user, per its instructions)

Similarly, the worker's output guardrail ensures responses start with "Response:". Within the configured `validation_attempts`, the worker must generate a correct response or the validation fails.

<Note>
Agent-to-agent messages are always single strings, so input guardrails for inter-agent communication always receive a string (not a list).
</Note>

### Recommended Mode for Internal Agents

Prefer **friendly mode** (`throw_input_guardrail_error=False`) for an agency's internal agents. **Strict mode** (`True`) is also supported, but friendly mode lets guardrail guidance flow through the agent chain without exceptions interrupting communication.

<Warning>
Due to the nature of Handoffs, using `SendMessageHandoff` for agent-to-agent communication will bypass input guardrails set between agents.
</Warning>

## Best Practices

<CardGroup cols={2}>
  <Card title="Single Responsibility" icon="bullseye">
    Each guardrail should check one thing. Create multiple guardrails for different concerns instead of combining logic.
  </Card>

  <Card title="Specific Error Messages" icon="message">
    Provide clear, actionable feedback in `output_info`. Tell the agent or user exactly what to fix.
  </Card>

  <Card title="Use Judge Agents" icon="scale-balanced">
    For complex decisions (like relevance or tone), delegate to a specialized judge agent instead of hard-coding rules.
  </Card>

  <Card title="Test Independently" icon="vial">
    Test guardrails with various inputs to ensure they catch invalid cases and allow valid ones.
  </Card>

  <Card title="Balance UX vs Enforcement" icon="balance-scale">
    Consider the user experience - friendly mode for guidance, strict mode for hard blocks.
  </Card>

  <Card title="Start Simple" icon="seedling">
    Begin with basic checks and expand as needed. Overly complex guardrails can slow response time.
  </Card>
</CardGroup>
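
One way to make guardrails easy to test independently is to keep the decision in a small pure function and let the decorated guardrail delegate to it. The sketch below assumes that split; `missing_prefix`, `CASES`, and `run_cases` are hypothetical names, and the checks run without an agent or a model call.

```python
def missing_prefix(text: str, prefix: str = "Request:") -> bool:
    """Pure check that a guardrail function can delegate to."""
    return not text.startswith(prefix)


# Table of (input, expected result) pairs covering valid and invalid inputs.
CASES = [
    ("Request: reset my password", False),
    ("Hello!", True),
    ("", True),
    (" Request: leading space", True),
]


def run_cases() -> int:
    """Check every case and return how many ran."""
    for text, expected in CASES:
        assert missing_prefix(text) is expected, f"unexpected result for {text!r}"
    return len(CASES)


print(run_cases())  # 4
```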

## See Also

<CardGroup cols={2}>
  <Card title="Tool Input Validation" icon="shield-check" href="/core-framework/tools/custom-tools/validation">
    Validate tool inputs using Pydantic validators
  </Card>

  <Card title="Agent Configuration" icon="gear" href="/core-framework/agents/advanced-configuration">
    Advanced agent configuration options
  </Card>

  <Card title="Agents SDK Guardrails" icon="link" href="https://openai.github.io/openai-agents-python/guardrails">
    Underlying guardrail implementation in OpenAI Agents SDK
  </Card>

  <Card title="Streaming" icon="signal-stream" href="/additional-features/streaming">
    Handle streaming events and responses
  </Card>

  <Card title="Examples" icon="code" href="https://github.com/VRSEN/agency-swarm/tree/main/examples">
    View complete guardrail examples on GitHub
  </Card>
</CardGroup>
